DETAILED ACTION 
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set 
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this 
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on May 2, 
2005 has been entered. 

Response to Arguments 

2. Applicant's arguments, see pages 8-9, filed May 2, 2005, with respect to the 
rejection(s) of claim(s) 1 and 15 under 35 U.S.C. 102(b) have been fully considered and 
are persuasive. Therefore, the rejection has been withdrawn. However, upon further 
consideration, a new ground(s) of rejection is made in view of Chu (U.S. Patent 
6,374,210). 

Independent claims 1, 15, and 16 have been amended to specifically include the 
limitation that the transcription of textual data is performed using a recognition system 
that uses a language model and phonetic dictionary of semantic units. Since Nanjo et 
al. explicitly states the method is not dictionary based, the rejections have been 
withdrawn. 

Claim Rejections - 35 USC § 102 

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

4. Claims 1, 2, 15, 16, 22, and 24 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Chu (U.S. Patent 6,374,210). 

In regard to claims 1, 15, and 16, Chu discloses a method and program storage 
device for managing a textual database, the method comprising the steps of: 

receiving textual data (Fig. 1, input means 100 receives an input string of 
connected text, column 5, lines 8-10); 

identifying a data type of the textual data (identification means 120 segments an 
input string using a vocabulary specific to a language and several languages are 
supported, column 5, lines 24-26 and lines 30-33); 

transcribing the textual data into corresponding semantic units of words using a 
recognition system for the identified data type, wherein the recognition system performs 
transcription by decoding the textual data using a language model and phonetic 
dictionary of semantic units (identification means 120 segments the input data 
using a lexicon (dictionary) 122 and language model 124, where the dictionary 
122 and language model 124 are selected according to the language of the textual data, 
column 5, lines 28-33; the lexicon for segmenting the textual data is based on sub-word 
units, column 6, lines 12-20 and line 64); and 

generating index based on semantic units of words for indexing the textual data 
with the corresponding semantic units (the sequence of possible word candidates, 
which are based on the sub-word units, are used to generate an automatic index, 
column 5, lines 38-42 and column 6, lines 26-30). 

Furthermore, since Chu discloses the semantic units of words are used to create 
an index, the textual data must inherently be stored, since an index, by definition, is a 
data table that points to stored information. 

Still further, Chu discloses the recognition system comprises an OCR (optical 
character recognition) system for transcribing typed text (column 5, lines 20-23), and an 
AHR (automatic handwriting recognition system) for transcribing handwritten text 
(column 5, lines 43-47). 
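
By way of illustration only, the steps recited above (receive text, select a lexicon for the identified data type, segment into sub-word units, and index by those units) can be sketched in Python. This is a minimal sketch under stated assumptions and is not Chu's implementation: the toy `LEXICONS` table, the greedy longest-match segmenter, and the names `segment` and `build_index` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-data-type lexicons of sub-word units (assumption, not from Chu).
LEXICONS = {
    "typed": {"in", "dex", "ing", "text"},
    "handwritten": {"in", "dex", "text"},
}

def segment(text, lexicon):
    """Greedy longest-match segmentation of text into lexicon units."""
    units, i = [], 0
    max_len = max(len(u) for u in lexicon)
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in lexicon:
                units.append(text[i:i + size])
                i += size
                break
        else:
            # No lexicon entry matches: keep the single character as its own unit.
            units.append(text[i])
            i += 1
    return units

def build_index(documents):
    """Map each unit to the (doc_id, position) pairs where it occurs."""
    index = defaultdict(list)
    for doc_id, (data_type, text) in documents.items():
        for pos, unit in enumerate(segment(text, LEXICONS[data_type])):
            index[unit].append((doc_id, pos))
    return dict(index)
```

Because each index entry stores a document identifier and a position, the sketch also reflects the point made above that an index presupposes stored text for its entries to point to.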

In regard to claim 2, Chu discloses the semantic units comprise syllables (column 
6, lines 12-20). 

In regard to claim 22, Chu discloses identifying a data type of the textual data 
comprises identifying types including handwritten (column 5, lines 43-47) and typed text 
(column 5, lines 20-23). 

In regard to claim 24, Chu discloses the recognition system comprises an OCR 
(optical character recognition) system for transcribing typed text (column 5, lines 20-23), 
and an AHR (automatic handwriting recognition system) for transcribing handwritten text 
(column 5, lines 43-47). 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 3, 8, 10-12, 14, 19-21, 25 and 26 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Chu, in view of Umemoto (U.S. Patent 6,470,334). 

In regard to claim 3, Chu discloses the semantic units comprise any linguistically 
based sub-word unit (column 6, lines 12-16). 

Chu does not disclose that the semantic units comprise morphemes. 

Umemoto discloses a method for creating an index to search documents that 
analyzes an input document (textual data) by morpheme analysis to index the 
documents by basal words (morphemes, column 8, lines 30-41). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to index input textual data based on morphemes in order to 
index languages such as Japanese, which does not clearly articulate breakpoints 
between words. 

In regard to claim 8, Chu does not disclose the step of generating an index 
comprises generating a hierarchical index where a semantic unit index points to one or 
more data modes. 

Umemoto discloses a hierarchical index where a semantic unit index points to 
one or more data modes (the word address is stored to register every word in 
sequential order, column 9, lines 1-7 and lines 15-19). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to generate a hierarchical index where a semantic unit index 
points to one or more data modes, in order to provide an index of smaller capacity so 
as to enable faster access in a retrieval search, as taught by Umemoto (column 15, 
lines 13-17). 

In regard to claims 11, 12, 20, and 21, Chu discloses the step of generating an 
index (column 5, lines 38-42), which implies that textual data corresponding to the index 
would be searched. 

Chu does not disclose searching the textual database for target textual data 
using the semantic index. 

Umemoto discloses searching the textual database for target textual data using 
the semantic index (column 7, lines 39-43). Furthermore, a target word must 
necessarily be converted into a string of semantic units to search the index, because 
the index comprises semantic units found in the input textual data. Therefore a target 
word must also be converted to semantic units in order to match relevant semantic unit 
entries in the index. Additionally, Umemoto discloses an automatic word boundary 
marking system that is applied to a search query (words in the input query are 
searched, column 7, lines 39-43). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to search the textual database using the semantic index, so that 
documents in languages such as Japanese, which does not clearly articulate 
breakpoints between words, could be searched, as taught by Umemoto (column 15, 
lines 1-9). 

In regard to claim 14, Chu does not disclose displaying search results. 

Umemoto discloses the results of a search are displayed (column 7, lines 47-53). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to display the results of the search so the user could view the 
results. 

Neither Chu nor Umemoto specifically disclose the target textual data is 
displayed starting from a corresponding semantic unit in a user query and commencing 
one of forward and backward for a given length based on a user request. 

Official notice is taken that it is notoriously well known in the art to display search 
results with the target search result as well as surrounding textual data so that the user 
can determine the context in which the search result is used in the original document. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Chu and Umemoto to display forward or 
backward for a given length from the target textual data so that the user can determine 
the context in which the search result is used in the original document. 
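
The display limitation discussed above (showing text from the matching semantic unit, forward or backward, for a requested length) can be illustrated with a short Python sketch. It is an assumption-laden example only: the function name `context_window` and its parameters are hypothetical and do not come from Chu, Umemoto, or the application.

```python
def context_window(units, hit, length, direction="forward"):
    """Return `length` units beginning at index `hit`, reading forward or backward."""
    if direction == "forward":
        return units[hit:hit + length]
    if direction == "backward":
        # Clamp at the start of the text so the window never wraps around.
        start = max(0, hit - length + 1)
        return units[start:hit + 1]
    raise ValueError("direction must be 'forward' or 'backward'")
```

In this sketch the matching unit is always included in the returned window, so the user sees the hit together with its surrounding context.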

In regard to claims 10, 19, 25, and 26, neither Chu nor Umemoto disclose 
generating separate indexes for each data type, then converging the separate indexes 
for each data type into one universal index. 

Official notice is taken that it is notoriously well known in the art to create 
separate indexes for each data type, so a user can restrict a search to one particular 
data type. Furthermore official notice is taken that it is notoriously well known in the art 
to converge separate indexes, so a user can search all available data types with one 
search entry. 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Chu and Umemoto to generate a separate 
index for each data type and to converge the separate indexes into a universal index, so 
a user would have the flexibility to search data types individually or search all data types 
at once. 
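
As a hedged illustration of the rationale above, the following Python sketch merges per-data-type indexes into one universal index while keeping the data type on each entry, so a search can still be restricted to one type. The function name `converge` and the index layout (a dict from semantic unit to a list of postings) are assumptions made for this example and are not drawn from the cited references.

```python
def converge(indexes_by_type):
    """Merge per-data-type indexes into one, tagging each posting with its type."""
    universal = {}
    for data_type, index in indexes_by_type.items():
        for unit, postings in index.items():
            universal.setdefault(unit, []).extend(
                (data_type, posting) for posting in postings
            )
    return universal
```

A query against the merged structure returns hits from every data type at once; filtering the returned postings on the first tuple element recovers a type-specific search.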

7. Claims 4 and 5 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Chu, in view of Holt et al. (U.S. Patent 5,960,447). 

Chu does not disclose the textual data is associated with audio data and indexing 
comprises indexing the audio data with the semantic units or time-stamping the 
semantic units. 

Holt et al. discloses a tagging and editing system that links textual data (word 
processor file Fig. 2, 60) to an audio file (53). Each semantic unit (word) in the textual 
data (word processor file 60) is indexed in the audio file (column 4, lines 1-18). The 
semantic units (words) are time-stamped (a time code pointing to a particular starting 
point in the audio file) (column 4, lines 5-7). A recognition system (52) receives speech 
as an input from the microphone (50) and transcribes the speech to textual data (text 
words) (column 3, lines 16-20). A speech recognition system typically utilizes a 
language model based on semantic units (e.g. phonemes in a HMM word model). 

Adding time-stamped indexes that link textual data transcribed with a recognition 
system to the corresponding audio data, as taught by Holt et al., to a system of 
managing a textual database would allow the playback of associated audio for each 
recognized semantic unit, thereby helping in correction and proofreading of a textual 
database, as taught by Holt et al. (column 4, lines 29-31). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to add time-stamped indexes to audio data corresponding to the textual data 
in order to help in the correction and proofreading of a textual database. 
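
For illustration, a time-stamped index of the general kind attributed to Holt et al. above can be sketched in Python as a list of words, each carrying an offset into the audio file. The `TimedWord` type and the `audio_offsets` helper are hypothetical names chosen for this sketch; they are not taken from Holt et al.

```python
from dataclasses import dataclass

@dataclass
class TimedWord:
    word: str
    start_seconds: float  # assumed offset of the word within the audio file

def audio_offsets(transcript, target):
    """Return the audio offsets at which `target` was recognized."""
    return [w.start_seconds for w in transcript if w.word == target]
```

Given such offsets, a proofreader could seek the audio to each occurrence of a word and compare what was spoken with what was transcribed.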

8. Claim 13 is rejected under 35 U.S.C. 103(a) as being unpatentable over Chu, in 
view of Umemoto, and further in view of Chang et al. (U.S. Patent 5,268,840). 

Neither Chu nor Umemoto disclose a target word is converted using a 
character-to-semantic unit mapping table. 

Chang et al. disclose a character-to-semantic unit mapping table (Fig. 6, column 
7, line 65 to column 8, line 8). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Chu and Umemoto to use a character-to- 
semantic unit mapping table, in order to provide an efficient method for morphologizing 
text (i.e. convert from characters to semantic units), as taught by Chang et al. (column 
4, lines 65-67). 

9. Claim 23 is rejected under 35 U.S.C. 103(a) as being unpatentable over Chu, in 
view of Vinsonneau et al. (U.S. Patent 5,319,745). 

Chu does not disclose different data types include handwritten text or typed text 
of different font or styles of a given language. 

Vinsonneau et al. disclose a method for scanning and indexing text that identifies 
different fonts of a given language (column 10, lines 45-49). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to identify different fonts of a given language, so that the fonts 
could be indexed, thereby allowing a user to limit their search of textual data by font. 

10. Claim 27 is rejected under 35 U.S.C. 103(a) as being unpatentable over Chu, in 
view of Syeda-Mahmood (U.S. Patent 5,953,451). 

Chu does not disclose indexing the semantic units to stored handwritten textual 
data based on handwriting biometric data. 

Syeda-Mahmood discloses a method for scanning and indexing text that indexes 
according to handwriting biometric data (orientation, skew, intra-word separation of a 
single author, column 3, lines 2-5 and lines 36-38). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Chu to index the semantic units based on handwriting biometric 
data, so that a user could limit their search of textual data to a certain individual. 

11. Claim 28 is rejected under 35 U.S.C. 103(a) as being unpatentable over Chu, in 
view of Umemoto, and further in view of Vinsonneau et al. 

Neither Chu nor Umemoto disclose the one or more modes of data comprises 
words or pictures. 

Vinsonneau et al. disclose a method for scanning and indexing text that includes 
a pointer to words and pictures (words in the text are indexed as well as the location of 
the words in the initial image from which the textual data is derived, column 10, lines 45- 
54). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify the combination of Chu and Umemoto to include pointers in 
the index to words and pictures, so the words could be associated with the original 
image files from which they were derived, and thus subsequently searched. 

Conclusion 



12. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Kuo (U.S. Patent 6,879,951) discloses a system which indexes 
input text by syllables. Tada et al. (U.S. Patent Application Publication 2003/0200211) 
disclose a system for indexing input text by morphemes. 

13. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brian L. Albertalli whose telephone number is (571) 272- 
7616. The examiner can normally be reached on Mon - Fri, 8:00 AM - 5:30 PM, every 
second Fri off. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Wayne Young, can be reached on (571) 272-7582. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 